An Active Learning Algorithm Based on Existing Training Data

نویسندگان

  • Hiroyuki TAKIZAWA
  • Taira NAKAJIMA
  • Hiroaki KOBAYASHI
  • Tadao NAKAMURA
چکیده

A multilayer perceptron is usually considered a passive learner that only receives given training data. However, if a multilayer perceptron actively gathers training data that resolve its uncertainty about a problem being learnt, sufficiently accurate classification is attained with fewer training data. Recently, such active learning has been receiving an increasing interest. In this paper, we propose a novel active learning strategy. The strategy attempts to produce only useful training data for multilayer perceptrons to achieve accurate classification, and avoids generating redundant training data. Furthermore, the strategy attempts to avoid generating temporarily useful training data that will become redundant in the future. As a result, the strategy can allow multilayer perceptrons to achieve accurate classification with fewer training data. To demonstrate the performance of the strategy in comparison with other active learning strategies, we also propose an empirical active learning algorithm as an implementation of the strategy, which does not require expensive computations. Experimental results show that the proposed algorithm improves the classification accuracy of a multilayer perceptron with fewer training data than that for a conventional random selection algorithm that constructs a training data set without explicit strategies. Moreover, the algorithm outperforms typical active learning algorithms in the experiments. Those results show that the algorithm can construct an appropriate training data set at lower computational cost, because training data generation is usually costly. Accordingly, the algorithm proves the effectiveness of the strategy through the experiments. We also discuss some drawbacks of the algorithm. key words: multilayer perceptrons, the network inversion algorithm, active learning, classification problems

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Inverted Pendulum Control Using Negative Data

   In the training phase of learning algorithms, it is always important to have a suitable training data set. The presence of outliers, noise data, and inappropriate data always affects the performance of existing algorithms. The active learning method (ALM) is one of the powerful tools in soft computing inspired by the computation of the human brain. The operation of this algorithm is complete...

متن کامل

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

متن کامل

An Effective Approach for Robust Metric Learning in the Presence of Label Noise

Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...

متن کامل

ارائه الگوریتمی مبتنی بر یادگیری جمعی به منظور یادگیری رتبه‌بندی در بازیابی اطلاعات

Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank has been shown to be useful in many applications of information retrieval, natural language processing, and data mining. Learning to rank can be described by two systems: a learning system and a ranking system. The learning system takes training data as input and constructs a ranking ...

متن کامل

An Evolutionary Multi-objective Discretization based on Normalized Cut

Learning models and related results depend on the quality of the input data. If raw data is not properly cleaned and structured, the results are tending to be incorrect. Therefore, discretization as one of the preprocessing techniques plays an important role in learning processes. The most important challenge in the discretization process is to reduce the number of features’ values. This operat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000